Apis mellifera filamentous virus from a honey bee gut microbiome survey in Hungary

In Hungary, as part of a nationwide, climatically balanced survey for a next-generation sequencing-based study of the honey bee (Apis mellifera) gut microbiome, repeated sampling was carried out during the honey production season (March and May 2019). Among other findings, the presence of Apis mellifera filamentous virus (AmFV) was detected in all samples, some at very high levels. AmFV-derived reads were more abundant in the March samples than in the May samples. In March, a higher abundance of AmFV-originated reads was identified in samples collected from warmer areas compared to those collected from cooler areas. A lower proportion of AmFV-derived reads were identified in samples collected in March from the wetter areas than those collected from the drier areas. AmFV-read abundance in samples collected in May showed no significant differences between groups based on either environmental temperature or precipitation. The AmFV abundance correlated negatively with Bartonella apihabitans, Bartonella choladocola, and positively with Frischella perrara, Gilliamella apicola, Gilliamella sp. ESL0443, Lactobacillus apis, Lactobacillus kullabergensis, Lactobacillus sp. IBH004. De novo metagenome assembly of four samples resulted in almost the complete AmFV genome. According to phylogenetic analysis based on DNA polymerase, the Hungarian strains are closest to the strain CH-05 isolated in Switzerland.

Honey bees (Apis mellifera) are important pollinators with high economic value and ecosystem importance, [1][2][3] and are exposed to confined environments, and several factors threaten their health, including various pathogens, parasites, and chemicals used as pesticides in agriculture [4][5][6] .The global decline of this important pollinator poses a threat to food security and biodiversity conservation 7 .The composition of the honey bee's normal or altered microbiota, for which the available knowledge is limited, may also affect their body function.Although there are studies on honey bee gut microbiota [8][9][10] , there is little evidence on the environmental factors that influence it.2][13][14][15] .The honey bee gut bacteriome has long been known to be composed of a few core bacterial species 11,16,17 .However, beyond the bacteriome, the viral composition of the microbiome is increasingly gaining more attention in bees 18 , as in humans 19,20 .Most of these studies understandably focus on bacteriophages as they play an essential role in shaping the composition of the bacteriome [18][19][20][21] .However, the viruses of the honey bee itself might be just as important.Especially considering those found in the gut and feces, which can contribute to their spread [22][23][24] .Many of these viruses are important pathogens 22,23 , while others, such as the Apis mellifera filamentous virus (AmFV), are little to not pathogenic to bees 25,26 .AmFV is the most significant DNA virus of the honey bee 25,26 and for a long time was the only known one 27 .This property of the virus is important because it allows us to investigate the relationship between the bacteriome and AmFV in metagenomic studies targeting bacteria.Despite the fact that the role of AmFV in various diseases is still uncertain, a better understanding of its ecology can bring us closer to understanding its role.
In 2019, a nationwide, climatically balanced survey was conducted in Hungary to investigate changes in the gut microbiome of honey bees based on next-generation sequencing (NGS).The bacteriome results of the study were published by Papp et al. (2022)  15 .In the course of the analyses, we found a large number of short reads originating from AmFV.In this paper, we present the results of a detailed analysis of the AmFV sequences found in the survey.

Bioinformatic and statistical analysis
Quality-based filtering and trimming were performed by Adapterremoval 28 using 20 as the quality threshold and only retaining reads longer than 50 bp.The remaining reads were taxonomically classified using Kraken2 (k=35) 29 with the NCBI non-redundant nucleotide database 30 .The taxon classification data was managed using functions of package phyloseq 31 and microbiome 32 .The abundance differences were analyzed by the DESeq2 package 33 .Analyzing the seasonal effect, a mixed-effect model was applied to handle the repeated measures by apiary as a random factor.The SparCC correlation coefficient quantified the relationship between the relative abundances of core microbiome species and AmFV. 34,35The core microbiome was defined with a relative abundance on species-level above 0.5% in at least one of the samples.The statistical tests were two-sided, and p-values less than 0.05 were considered significant.The cleaned reads were aligned to the AmFV genome (KR819915.2) by Bowtie2 36 with the very-sensitive-local setting.De novo assembly was carried out using MEGAHIT (v1.2.9) 37 , polishing of the contigs was performed with POLCA (v4.1.0) 38, and scaffolds were created by RagTag (v2.1.0) 39sing the AmFV genome (KR819915.2) 25 .The average nucleotide identity (ANI) of scaffolds compared to the genome KR819915.2was estimated by pyani (v0.2.12). 40For the genome annotation Prokka (v1.14.6) 41 was used guided by the genome KR819915.2.Predicted protein homology analysis was performed using the NCBI BLASTP (v2.14.0) 42 algorithm with a minimum e-value of 1.0e-5 on two reference genomes (KR819915.2,OK392616.1).Phylogenetic analysis was performed based on the amino acid sequences of the DNA polymerase gene.The genetree was constructed 43 based on multiple sequence alignment by MAFFT (v7.490) 44 .The best substitution model was selected by functions of phangorn package 45 based on the Bayesian information criterion.The generated neighbor-joining tree was optimized by the maximum likelihood method.Bootstrap values were produced by 100 iterations.All data processing and plotting were done in the R-environment 46

Results
The shotgun sequencing generated paired-end read counts of samples are ranging between 311,931 and 546,924, with a mean of 413,629.The OTU table, created by Kraken2 taxonomic classification, contained counts of samples ranging between 175,576 and 314,586 with a median of 262,292.The minimum, maximum, and median read counts of the samples assigned as viral-originated were 443, 72,010, and 1,074, respectively.The viral hits were dominated by reads matching the genome of AmFV.All of the samples contained reads from this species, and their relative abundance per sample is summarised in Fig 2 .AmFV-derived reads were more abundant in the March samples than in the May samples (fold change (FC): 5.53, 95%CI: 2.38-12.84,p<0.001).In March, a higher abundance of AmFV-originated reads was identified in samples collected from warmer areas compared to those collected from cooler areas (FC: 26.05, 95%CI: 7.31-92.81,p<0.001).A lower proportion of AmFV-derived reads were identified in samples collected in March from the wetter areas than in those collected from the drier areas (FC: 0.33, 95%CI: 0.13-0.8,p=0.014).The level of AmFV-reads found in samples collected in May showed no significant differences between groups based on either environmental temperature or precipitation (FC: 1.44, 95%CI: 0.61-3.39,p=0.40;FC: 1.85, 95%CI: 0.78-4.37,p=0.16).
De novo assembly of four samples (apiary ID 10, 16, 18 in March and 13 in May) resulted in almost the complete AmFV genome.For these samples, Table 1 shows the coverage and depth of the reads over the reference genome (KR819915.2) and statistics on the scaffolds and the ORFs identified within them.

Figure 2.
Relative abundance of Apis mellifera filamentous virus originated reads for the first (March) and second (May) sampling.The environmental condition, growing degree-day (GDD), and precipitation categories of sampling sites are also marked.

Table 1.
Alignment and assembly statistics.Columns 2 and 3 describe the coverage of reads on the reference genome and average depth.The lengths of the scaffolds created by the de novo assembly and the proportion of gaps within them are shown in columns 5 and 6.Column 7 shows the average nucleotide identity (ANI) of scaffolds estimated for the reference genome (KR819915.2).The last two columns show how many ORFs were predicted in the scaffolds generated and how many of these ORFs were predicted to have a protein product that matched a CDS product of the reference genome.Table 2 summarises the predicted proteins in the scaffolds generated from our samples that can be linked to products with predicted functions in the KR819915.2and/or OK392616.1 genomes.Figure 3 shows the genetree based on the amino acid sequences of the DNA polymerase gene with the best substitution model JTT+I.

Discussion
Even though samples presented in this study were taken from healthy bee specimens, AmFV was detected in all samples.This is in line with our current knowledge of the virus, according to which it is only pathogenic in acute cases and/or if the bee colony is under stress 26,47,48 .The virus is otherwise commonly prevalent, or even endemic in bee colonies, and besides its very likely oral-fecal transmission route, can possibly be spread transovarially from the queen to the workers 25 .
Table 2. ORF homologs correspond to genes with predicted functions in the KR819915.2and/or OK392616.1 genomes.The last four columns show the coverage and sequence identity of the product of the ORFs predicted in our scaffolds that match the product of the reference genome (KR819915.2) with the highest similarity.In contrast to other studies, we have observed a decline in AmFV abundances as the honey-producing season advanced.Bailey et al. observed an increasing prevalence of infected colonies from the beginning of autumn until the end of spring, which was followed by a steep decline 49 .Similarly, Hartmann and colleagues observed a significant difference in AmFV loads between autumn and springtime 26 .It is possible that regional climatic differences might influence the dynamics of viral loads.Further, large-scale surveys are necessary, however, to confirm such patterns.
Furthermore, the relationship of AmFV with the other components of the microbiome can provide us useful information.We found a significant positive correlation with AmFV for several bacteria.In some of these, the abundance of the bacterial species also decreased as the season progressed, as we have observed previously for L. apis and L. kullabergensis 15 .For other species, such as F. perrara, G. apicola and S. alvi, we have not observed a seasonal pattern in our previous analysis 15 .Among these, F. perrara is especially of interest, which is known to stimulate the immune system of the honey bees strongly 50 .This species, as a potential opportunistic pathogen 51 , may act as a stressor on the bees, thus explaining the observed positive correlation with AmFV.Significant negative correlation was only found for two newly described Bartonella species, B. apihabitans and B. choladocola 52 .Regarding the only previously known Bartonella species in bees, B. apis, our previous analysis of the same samples found an increase in abundance of this bacterial species with the progression of the season 15 .This might suggest that even these two newly described species may have shown similar changes, which may explain the negative correlation.
The reason for the seasonal peaks of AmFV or the association between the virus and other members of the honey bee microbiome is yet unknown.Furthermore, the effect of annually higher AmFV abundances on the immune system of bees needs to be determined.Moreover, other studies found correlations in the number of AmFV and other RNA viruses, such as Deformed Wing Virus (DWV) and Black Queen Cell Virus (BQCV) 26,53 , indicating the benefit that could be achieved if the microbiome could be analyzed as a whole.
Our results confirm the high prevalence of the virus, as previous studies suggested 26,49 .There were, however, 3 March samples (sample numbers 10, 16, and 18) in which we found an exceptionally high abundance of AmFV.This may be somewhat related to our observations on seasonality, as the virus was not present in such high numbers in the corresponding May samples.Accordingly, the abundance of the virus decreased as the season progressed.In contrast, however, sample 13 was dominated by AmFV in May and, therefore, may be of further interest.The spike in virus abundance may indicate the presence of some stressor on the colony, which has not yet caused observable symptoms in the animals.Bee viruses have been described, for example, to increase in abundance in response to certain pesticides in a concentration-dependent manner 54,55 , which is thought to be due to the immunomodulatory effect of pesticides 54 If indeed the emergence of stressors could be associated with such a fluctuation in viral abundance, it could also suggest the use of the virus as a health indicator and contribute to the diagnosis of various complex disease processes.
Besides their direct effect on honey bees, the spread of various microorganisms between A. mellifera or other bee species of economic importance and wild bees can be of particular economic and ecological significance 56 .The spillover of pathogens between different arthropod species has been observed for several microorganisms related to A. mellifera 57 , among which AmFV is no exception.It has been detected in several bee species and also in wasps, flies and beetles [56][57][58][59] .However, it is important to note that according to our current knowledge, AmFV does not show significant pathogenic effect on A. mellifera nor at the colony nor the individual level 26 , which does not necessarily mean that solitary wild bee species do not show susceptibility to the virus as well.Indeed, it is possible that these species may show increased sensitivity to certain adverse effects due to their solitary lifestyle, as it has already been shown for the effect of pesticides 60 .Furthermore, the positive association of the virus with BQCV 26 , which has also been detected in wild bee species 57 , may contribute to the potential harm caused by AmFV.Therefore it might be assumed that the ecological and economic importance of AmFV can be significantly determined by the impact the virus may have on wild bee species and their importance in pollination services 61 .
Based on the phylogenetic analysis, the sample from apiary 16, collected in March from the southwest of Hungary, had the highest similarity to the Swiss reference genome.Both other samples (apiary ID 10 March and 13 May) that contained the DNA-polymerase gene in full length were derived from Eastern Hungary.Even though sample 13 from May is relatively closer to the west of the country due to colony migration, it is originally from Eastern Hungary, and the permanent beekeeping premises are 76 air kilometers away from March sampling point 10.Accordingly, it can be supposed that the viral strain from sample 13 collected in May has already been present at the overwintering location and migrated to the May sampling point.It can be assumed that the mediating effect of the imports of honey and propolis from Hungary to Switzerland is responsible for the close genetic relationship of the Hungarian strains of AmFV.
More than one hundred ORFs were identified in each of four of our scaffolds that did not show any similarity to the reference genome with the homology assumptions.In a previous study, Yang and colleagues (2022) 62 predicted the functionality of newly detected ORFs from AmFV isolates.As our sequences were of metagenomic origins, the predicted products may derive from organisms other than AmFV despite our sequences' high similarity to the reference sequence.Accordingly, we have presented only those products that appeared in any of the two whole genomes.
In the present study, we report 4 new assemblies of the AmFV from Hungary.Our results may provide deeper insights into the genome organization of this virus.Furthermore, our results suggest a seasonal trend of the virus abundance, i.e., there is a decrease in it in the gastrointestinal tract of bees as the production season progresses.The high abundance of the virus in some bee colonies and its association with certain members of the microbiome, in particular F. perrara, may suggest a link between the virus and various stressors.It may, therefore, be important either as an agent of complex disease processes or as an indicator of the health status of the hive. .

Figure 1 .
Figure 1.Sampling points in Hungary.The green dots indicate migrating apiaries in May.The blue dots indicate migratory apiaries at the time of sampling in March, as well as non-migratory apiaries.The inset map shows the study region, Hungary in Europe, colored yellow.Neighboring countries are presented by ISO 3 character codes: Austria (AUT), Croatia (HRV), Romania (ROU), Serbia (SRB), Slovakia (SVK), Slovenia (SVN), Ukraine (UKR).